{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "ff7e31df",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/vector_stores/Neo4jVectorDemo.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "80018bc3-f3fe-47ae-a579-f837fdf728a0",
   "metadata": {},
   "source": [
    "# Neo4j vector store"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "5ae79640",
   "metadata": {},
   "source": [
    "If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a256f772",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index-vector-stores-neo4jvector"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "74c7850b",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install llama-index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7e67be7b-f135-4feb-827e-6585f86c4ed2",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import openai\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"OPENAI_API_KEY\"\n",
    "openai.api_key = os.environ[\"OPENAI_API_KEY\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "086f3065-3072-4588-82cb-2a852019451c",
   "metadata": {},
   "source": [
    "## Initiate Neo4j vector wrapper"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "910d6b13-576e-47b1-96dd-eacbfe10fa0b",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.vector_stores.neo4jvector import Neo4jVectorStore\n",
    "\n",
    "username = \"neo4j\"\n",
    "password = \"pleaseletmein\"\n",
    "url = \"bolt://localhost:7687\"\n",
    "embed_dim = 1536\n",
    "\n",
    "neo4j_vector = Neo4jVectorStore(username, password, url, embed_dim)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2c9c4515-982d-4f78-b099-f70eabfae60c",
   "metadata": {},
   "source": [
    "## Load documents, build the VectorStoreIndex"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "348a4c97-bbf9-4eb1-8669-079c54588fbf",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import VectorStoreIndex, SimpleDirectoryReader\n",
    "from IPython.display import Markdown, display"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "d9cd108b",
   "metadata": {},
   "source": [
    "Download Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "71729c84",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--2023-12-14 18:44:00--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt\n",
      "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.109.133, 185.199.110.133, ...\n",
      "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.\n",
      "HTTP request sent, awaiting response... 200 OK\n",
      "Length: 75042 (73K) [text/plain]\n",
      "Saving to: ‘data/paul_graham/paul_graham_essay.txt’\n",
      "\n",
      "data/paul_graham/pa 100%[===================>]  73,28K  --.-KB/s    in 0,03s   \n",
      "\n",
      "2023-12-14 18:44:00 (2,16 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "!mkdir -p 'data/paul_graham/'\n",
    "!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aecb970b-7d52-4b0b-8799-605187a01dd3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# load documents\n",
    "documents = SimpleDirectoryReader(\"./data/paul_graham\").load_data()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8f2ee4d4-addc-49cf-b7ae-0d6146e0f717",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import StorageContext\n",
    "\n",
    "storage_context = StorageContext.from_defaults(vector_store=neo4j_vector)\n",
    "index = VectorStoreIndex.from_documents(\n",
    "    documents, storage_context=storage_context\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "59b91a75-0754-4ded-af05-adceda3557d8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<b>At Interleaf, they added a scripting language inspired by Emacs and made it a dialect of Lisp. They were looking for a Lisp hacker to write things in this scripting language. The author of the text worked at Interleaf and mentioned that their Lisp was the thinnest icing on a giant C cake. The author also mentioned that they didn't know C and didn't want to learn it, so they never understood most of the software at Interleaf. Additionally, the author admitted to being a bad employee and spending much of their time working on a separate project called On Lisp.</b>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "query_engine = index.as_query_engine()\n",
    "response = query_engine.query(\"What happened at interleaf?\")\n",
    "display(Markdown(f\"<b>{response}</b>\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9d5795fc-f517-47a1-ac8a-b5299860e5cd",
   "metadata": {},
   "source": [
    "## Hybrid search\n",
    "\n",
    "Hybrid search uses a combination of keyword and vector search\n",
    "In order to use hybrid search, you need to set the `hybrid_search` to `True`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "49e737d4-8945-469f-a167-37ec8537b82f",
   "metadata": {},
   "outputs": [],
   "source": [
    "neo4j_vector_hybrid = Neo4jVectorStore(\n",
    "    username, password, url, embed_dim, hybrid_search=True\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a17ead34-20d2-4610-9167-9d73675f4d56",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<b>At Interleaf, they added a scripting language inspired by Emacs and made it a dialect of Lisp. They were looking for a Lisp hacker to write things in this scripting language. The author of the essay worked at Interleaf but didn't understand most of the software because he didn't know C and didn't want to learn it. He also mentioned that their Lisp was the thinnest icing on a giant C cake. The author admits to being a bad employee and spending much of his time working on a contract to publish On Lisp.</b>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "storage_context = StorageContext.from_defaults(\n",
    "    vector_store=neo4j_vector_hybrid\n",
    ")\n",
    "index = VectorStoreIndex.from_documents(\n",
    "    documents, storage_context=storage_context\n",
    ")\n",
    "query_engine = index.as_query_engine()\n",
    "response = query_engine.query(\"What happened at interleaf?\")\n",
    "display(Markdown(f\"<b>{response}</b>\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e30dd545-7a0e-44a5-aeb7-3eef9312c538",
   "metadata": {},
   "source": [
    "## Load existing vector index\n",
    "\n",
    "In order to connect to an existing vector index, you need to define the `index_name` and `text_node_property` parameters:\n",
    "\n",
    "- index_name: name of the existing vector index (default is `vector`)\n",
    "- text_node_property: name of the property that containt the text value (default is `text`)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "872deaed-2fc8-48ba-be52-aae9b260508a",
   "metadata": {},
   "outputs": [],
   "source": [
    "index_name = \"existing_index\"\n",
    "text_node_property = \"text\"\n",
    "existing_vector = Neo4jVectorStore(\n",
    "    username,\n",
    "    password,\n",
    "    url,\n",
    "    embed_dim,\n",
    "    index_name=index_name,\n",
    "    text_node_property=text_node_property,\n",
    ")\n",
    "\n",
    "loaded_index = VectorStoreIndex.from_vector_store(existing_vector)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9e286e74-6c3c-43f6-a887-70016740a4f8",
   "metadata": {},
   "source": [
    "## Customizing responses\n",
    "\n",
    "You can customize the retrieved information from the knowledge graph using the `retrieval_query` parameter.\n",
    "\n",
    "The retrieval query must return the following four columns:\n",
    "\n",
    "* text:str - The text of the returned document\n",
    "* score:str - similarity score\n",
    "* id:str - node id\n",
    "* metadata: Dict - dictionary with additional metadata (must contain `_node_type` and `_node_content` keys)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3c418367-ac82-4a53-9963-9cd6c190bd35",
   "metadata": {},
   "outputs": [],
   "source": [
    "retrieval_query = (\n",
    "    \"RETURN 'Interleaf hired Tomaz' AS text, score, node.id AS id, \"\n",
    "    \"{author: 'Tomaz', _node_type:node._node_type, _node_content:node._node_content} AS metadata\"\n",
    ")\n",
    "neo4j_vector_retrieval = Neo4jVectorStore(\n",
    "    username, password, url, embed_dim, retrieval_query=retrieval_query\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ef46046e-8c71-47ec-a948-96201a48a81e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<b>Interleaf hired Tomaz.</b>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "loaded_index = VectorStoreIndex.from_vector_store(\n",
    "    neo4j_vector_retrieval\n",
    ").as_query_engine()\n",
    "response = loaded_index.query(\"What happened at interleaf?\")\n",
    "display(Markdown(f\"<b>{response}</b>\"))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
